Anonymity: Formalisation of Privacy – k-anonymity

Author

  • Janosch Maier
Abstract

Microdata is the basis of statistical studies. When microdata is released, it can leak sensitive information about the participants, even if identifiers such as name or social security number are removed. Proper anonymization of statistical microdata is therefore essential. K-anonymity has been discussed intensively as a measure of anonymity in statistical data. Quasi-identifiers are attributes that might be used to identify individual participating entities in a study. Linking different tables can leak sensitive information. K-anonymity therefore requires that each combination of quasi-identifier values appears at least k times in the data. When subsequent data is released, certain limitations have to be observed for the complete data to adhere to k-anonymity. In this paper, we describe the level of anonymity that k-anonymity provides. We show how l-diversity and t-closeness provide a stronger level of anonymity than k-anonymity. Because microdata has to be anonymized, free toolboxes that provide k-anonymity, l-diversity and t-closeness are available on the Internet. We present the Cornell Anonymization Toolkit and the UTD Anonymization Toolbox. Together with Kern, we analyzed geodata gathered from Android devices with respect to its anonymity level. To this end, we transferred the data into an SQLite database for easier data manipulation. We used SQL queries to show that this data is not anonymous. We provide a value generalization hierarchy based on the attributes model, device, version and network. Using the UTD Anonymization Toolbox, we transformed the data into a k-anonymous state. For different values of k, different generalizations are possible. We show parts of a 3-anonymous version of the input data in this paper.
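The k-anonymity condition stated in the abstract (every combination of quasi-identifier values occurs at least k times) can be checked with a single grouped SQL query. The following is a minimal sketch, not the paper's actual code: the table name `records`, the sample rows, and the helper names `load`, `violating_groups`, `is_k_anonymous` and `generalize_version` are all assumptions made for illustration. Only the attribute names model, device, version and network come from the abstract, and the single generalization step shown (keeping only the major version) is a hypothetical stand-in for one level of a value generalization hierarchy.

```python
import sqlite3

# Hypothetical sample rows (model, device, version, network); values are made up.
ROWS = [
    ("Nexus 4", "mako", "4.2.2", "wifi"),
    ("Nexus 4", "mako", "4.2.1", "wifi"),
    ("Nexus 4", "mako", "4.2.2", "wifi"),
    ("Galaxy S3", "m0", "4.1.2", "wifi"),
    ("Galaxy S3", "m0", "4.1.1", "wifi"),
    ("Galaxy S3", "m0", "4.0.4", "wifi"),
]

# Attributes treated as quasi-identifiers in this sketch.
QUASI_IDENTIFIERS = ["model", "device", "version", "network"]


def load(rows):
    """Load rows into an in-memory SQLite table named `records` (assumed name)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (model TEXT, device TEXT, version TEXT, network TEXT)"
    )
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?)", rows)
    return conn


def violating_groups(conn, k):
    """Return quasi-identifier combinations that occur fewer than k times."""
    cols = ", ".join(QUASI_IDENTIFIERS)
    query = (
        f"SELECT {cols}, COUNT(*) FROM records "
        f"GROUP BY {cols} HAVING COUNT(*) < ?"
    )
    return conn.execute(query, (k,)).fetchall()


def is_k_anonymous(conn, k):
    """A table is k-anonymous if no combination occurs fewer than k times."""
    return not violating_groups(conn, k)


def generalize_version(row):
    """Hypothetical generalization step: keep only the major version number."""
    model, device, version, network = row
    major = version.split(".")[0]
    return (model, device, major + ".*", network)


if __name__ == "__main__":
    raw = load(ROWS)
    print("raw data 3-anonymous:", is_k_anonymous(raw, 3))
    generalized = load([generalize_version(r) for r in ROWS])
    print("generalized data 3-anonymous:", is_k_anonymous(generalized, 3))
```

In this made-up data set the raw rows fail the check for k = 3 because exact version strings split the devices into small groups, while the generalized rows pass. A coarser generalization raises the achievable k at the cost of less precise data.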

Similar resources

Improved Univariate Microaggregation for Integer Values

Privacy issues during data publishing are an increasing concern of involved entities. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...

Enhanced P-Sensitive K-Anonymity Models for Privacy Preserving Data Publishing

Publishing data for analysis from a microdata table containing sensitive attributes, while maintaining individual privacy, is a problem of increasing significance today. The k-anonymity model was proposed for privacy-preserving data publication. While focusing on identity disclosure, the k-anonymity model fails to protect against attribute disclosure to some extent. Many efforts are made to enhance the k-...

Research on Privacy Preserving on K-anonymity

The disclosure of sensitive information has become prominent nowadays; privacy preservation has become a research hotspot in the field of data security. Among all the algorithms of privacy preservation in data mining, K-anonymity is a kind of common and valid algorithm in privacy preservation, which can effectively prevent the loss of sensitive information under linking attacks, and it is widel...

Generating Microdata with P-Sensitive K-Anonymity Property

Existing privacy regulations together with large amounts of available data have created a huge interest in data privacy research. A main research direction is built around the k-anonymity property. Several shortcomings of the k-anonymity model have been fixed by new privacy models such as p-sensitive k-anonymity, l-diversity, (α, k)-anonymity, and t-closeness. In this paper we introduce the Enh...

Multi-dimensional k-anonymity Based on Mapping for Protecting Privacy

Releasing data carries a privacy disclosure risk if no protection policy is applied. Although attributes that clearly identify individuals, such as Name, Identity Number, are generally removed or decrypted, attackers can still link these databases with other released databases on attributes (Quasi-identifiers) to re-identify individuals' private information. K-anonymity is a significant method for priv...

Publication date: 2013